Cox regression modeling of survival after chemotherapy for colon cancer

Author

Hermela Shimelis

Published

December 1, 2024

I. Introduction

The Cox Proportional Hazards (PH) model is one of the most widely used regression models used for survival analysis, also known as time-to-event analysis. The model investigates the association between survival time and one or more predictors. The Cox PH model is based on the hazard function, which is defined as the probability that an individual will experience the event at a given time, provided that the event has not occurred yet [1]. The outcome variable is the survival time, which is measured from a defined time until the occurrence of an event, such as death, or the end of the study period [1]. For each subject, there is time to event (T), which is the duration from a defined starting point to the event occurrence, or a censoring time (C), which is the duration until the subject drops out of the study or the study ends.

The Cox Proportional Hazards (PH) model relies on two main assumptions: 1) random censoring, and 2) the proportional hazard assumption [2]. The random censoring assumption is satisfied when patients censored in the study are a random sample from the study population. There is no statistical test for this assumption, but it can be ensured through rigorous experimental design. The proportional hazard assumption requires that the hazard function (hazard ratio) for two groups (e.g., experimental arm and standard arm) remains proportional, meaning the hazard ratio is constant over time. This assumption is violated if a covariate’s impact on the outcome changes over the follow-up period, a common occurrence in biochemical research. When the proportional hazard assumption is violated, various methods can address this, including stratifying the model based on variables that violate the assumption. This stratification allows the baseline hazard to differ across strata [3]. Another method involves including interactions between covariates and time, enabling hazard ratios to vary over time [4]. While baseline hazards are allowed to differ, the effects of other covariates are presumed consistent across strata. A limitation of this approach is that the effect of the stratification variable cannot be tested, as it is not included as a covariate in the model. Other statistical methods to accommodate non-proportional hazards are detailed and discussed elsewhere [5].

One of the properties of the Cox PH model contributing to its popularity is that the baseline hazard, i.e., \(h_0\), is an unspecified field, which makes it a semi-parametric model. It is robust and can provide reliable estimates without specifying the baseline hazard function. Even though the baseline hazard is not known, it is possible to estimate the coefficients in the exponential part of the equation. Hence, the effect of variables included in the model can be estimated [6]. Another feature of the Cox PH model is its interpretability. The effect of covariates on the hazard is represented by hazard ratios, i.e., the relative likelihood of an event happening in the experimental group with respect to the standard group.

The Cox PH model is widely used across various fields, including medical research, epidemiology, business, engineering, and social sciences [6]. It is commonly used to determine survival rates among cancer patients with different subtypes of cancer [7], or between those treated with different treatments [8]. The Cox PH model has also been used effectively in non-medical research, for example, in the prediction of time to loan defaults [9], customer time to churn [10], product lifespan and failure time analysis in engineering [11], and for the identification of factors that influence the time it takes to achieve certain rates of success, such as vaccination rates, in public health [12].

The current analysis evaluated data from one of the first successful trials of adjuvant chemotherapy for colon cancer. The colon cancer patients in the study had their primary treatment of surgery and the objective was to test whether treatment with either Levamisole or Levamisole in combination with Fluorouracil (5-FU) chemotherapeutic agents improves survival. Levamisole is a low-toxicity compound previously used to treat worm infestations in animals; 5-FU is a moderately toxic chemotherapy agent. There are two records per person, one for recurrence and one for death. The purpose of this project is to compare survival between the untreated (Obs) group vs. those treated with amisole (Lev), or amisole + 5-FU.

The current analysis evaluated modified data from one of the first successful trials of adjuvant chemotherapy for colon cancer. The study involved colon cancer patients who had undergone primary surgical treatment, and the objective was to test whether treatment with either Levamisole alone or Levamisole combined with Fluorouracil (5-FU) chemotherapeutic agents improves survival. Levamisole is a low-toxicity compound previously used to treat worm infestations in animals; 5-FU is a moderately toxic chemotherapy agent. There are two records per person, one for recurrence and one for death. The purpose of this project is to compare survival between the untreated (Obs) group vs. those treated with amisole (Lev), or amisole + 5-FU.

II. Methods

The Cox proportional hazards model was used to model the relationship between survival time and different colon cancer treatments. In particular, the survival time between the untreated group (observation) and those treated with amisole (Lev) or amisole + 5-FU was compared. The Cox regression model was chosen for this study because it is well-suited for studying the association between survival time of patients and predictors, and it allows estimating the risk or hazard of death increased or decreased due to each treatment relative to no treatment. The time (in days) until the event, i.e., death, was modeled as a function of treatment and other variables, including age, sex, and various tumor characteristics. Significant predictors were included in the final model.

IIa. Data Source

The colon cancer cancer dataset is a built-in dataset in the Survival R package [13] [14]. It is closely aligned with to the original study published in 1991 [15]. The dataset includes 929 subjects with stage B or C colon cancer, who were randomized into three treatment groups: Observation, Levamisole (Lev), or Levamisole + 5-FU. While the dataset contains survival and recurrence information for these individuals, the current analysis focused on time to death, so rows representing recurrence time were filtered out. Time to death is given in days. The median follow-up period was 1,976 days (5.4 years), with a range of 23-3,329 days. The dataset includes various patient characteristics, such as demographics and tumor details, as well as the duration from surgery to trial registration, categorized as either short or long.

Table 1: Description of variables included in the dataset

id: id
study: 1 for all patients
rx: Treatment - Obs(ervation), Lev(amisole), Lev(amisole)+5-FU
sex: 1=male
age: in years
obstruct: obstruction of colon by tumour
perfor: perforation of colon
adhere: adherence to nearby organs
nodes: number of lymph nodes with detectable cancer
time: days until event or censoring
status: censoring status
differ: differentiation of tumour (1=well, 2=moderate, 3=poor)
extent: Extent of local spread (1=submucosa, 2=muscle, 3=serosa, 4=contiguous structures)
surg: time from surgery to registration (0=short, 1=long)
node4: more than 4 positive lymph nodes
etype: event type: 1=recurrence,2=death

IIb. Exploratory data analysis

The data was evaluated for missing values, duplicate entry, and outliers. The Survminer [16] package was used to plot Kaplan-Meier curves to visualize survival probability over time for each of the categorical variables. Multicolinearity was tested using Variant Inflation Factor (VIF) calculated using MASS package [17].

IIc. Statistical analysis

The R statistical software version 4.3.2 [18] was used for all analysis. The Survival package was used to construct the Cox regression model [13] [14].

Cox regression model is based on the hazard function \(h_x(t)\) with covariates at time t given by:

\(h_x(t)=h_0(t)\exp(\beta_1x_1 +\beta_2x_2 + \dots + \beta_p x_p)\)

Where:

\(h_x(t)\) is the hazard function

\(h_0(t)\) is the baseline hazard function

\(\beta_1x_1 + \beta_2x_2 + \dots +\beta_p x_p\) represent the linear combination of covariates and their coefficient

The hazards for the observation vs. amisole (Lev), or amisole + 5-FU group with covariate values x1 and x2 are given by: \(hx_1(t)=h_0(t)\exp(\beta_1x_1)\) and \(hx_2(t)=h_0(t)\exp(\beta_2x_2)\), respectively

The hazard ratio is expressed as: HR = \(hx_2(t)\) / \(hx_1(t)\) = \(\exp[\beta(x_2-x_1)]\)

The R MASS package was used for Step-wise variable selection, using “both” forward and backward variable selection [17]. For Step-wise selection, stepAIC() function uses AIC (Akaike Information Criterion) as the measure to add or remove predictors from the model.

IId. Model evaluation

Diagnostic for proportional hazard assumption

The Schoenfeld residual plot was constructed to test Cox proportional hazards assumption. When the proportional hazards assumption was not met for any of the covariates, stratification approach was used.

K-fold cross-validation

Model performance was evaluated using 5-fold cross-validation using the boot package [19] [20].

Model selection

The Concordance (c-index), AIC, and Baysian Information Criteriod (BIC) of corresponding models were compared to select the best fit model. The model with the lowest AIC and BIC values and highest concordance was selected as the final model.

III. Analysis and Results

IIIa. Data cleaning and feature engineering

The dataset contained missing values in the number of nodes (nodes) and differentiation (differ) columns. Missing values in the differentiation and nodes columns were replaced with the mode and median, respectively. Categories representing different groups in the differentiation, local spread, and time to surgery (surg) columns were replaced with descriptive names. Evaluation of the nodes and node4 variables showed that samples with more than four lymph nodes (node4) had fewer than five counts in the nodes column, indicating inconsistency between the two columns. Therefore, the nodes column was not used for further analysis.

IIIb. Exploratory data analysis

Figure 1: Exploratory data analysis

Exploratory data visualization (Figure 1) showed that the study participants were equally randomized into three treatment groups, with approximately 33% of individuals in the observation, Levamisole (Lev), or Lev + 5-FU groups. Fifty-two percent of the participants were male. The dataset is balanced with respect to the outcome (status), with 51% censored and 49% having died before the end of the follow-up period.

Regarding tumor characteristics, obstruction of the colon by the tumor (obstruct), perforation of the colon (perfor), and tumor adherence to nearby organs (adhere) were observed in 19%, 3%, and 15% of the patients, respectively. About 27% of the participants had more than four tumor-positive lymph nodes (node 4) and a long time from surgery to registration (surg). Overall, most categories of tumor characteristics are well represented, with about 15% of the participants having these characteristics, except for perforation, differentiation, and local spread.

Age is normally distributed. Evaluation of time to death or censoring showed a bimodal distribution. Stratifying the time variable by outcome (status) revealed that the majority of the patients died within three years of the study period.

Table 2: Patient characteristics

Observation (%) Amisole (%) Amisole + 5-FU (%)
N=315 N=310 N=304
Demographics
Male 166 (52.3) 177 (57.1) 141
Median age (years) [IQR] 60 [53,68] 61 [53,69] 61 [52,70]
Cancer characteristics
Colon obstruction 63 (20.0) 63 (20.3) 54 (17.8)
Colon perforation 9 (2.9) 10 (3.2) 8 (2.6)
Adherence to nearby organs 47 (14.9) 49 (15.8) 39 (12.8)
Differentiation of tumor
Well 27 (8.6) 37 (11.9) 29 (9.5)
Moderate 236 (74.9) 229 (73.9) 221 (72.7)
Poor 52 (16.5) 44 (14.2) 54 (17.8)
Extent of local spread
Contiguous 20 (6.3) 12 (3.9) 11 (3.6)
Muscle 38 (12.1) 36 (11.6) 32 (10.5)
Serosa 249 (79.0) 259 (83.5) 251 (82.6)
Submucosa 8 (2.5) 3 (1.0) 10 (3.3)
More than 4 lymph nodes with cancer Yes 87 (27.6) 89 (28.7) 79 (26.0)
Short time from surgery to registration (%) Yes 91 (28.9) 80 (25.8) 76 (25.0)

As shown in Table 2, the baseline patient characteristics were very similar between the three treatment groups, suggesting that differences in survival time between the groups are not likely attributed to differences at the start of the study.

IIIc. Survival curves

Figure 2: Censoring (left) or survival (right) time of patients

The median survival time of patients who were censored (left) or died (right) was 2,331 and 775 days, respectively. These results show that patients who were not censored died early during the follow-up period.

Stratified survival curves

#| echo: false
#| message: false
#| warning: false
#| include: true


f1 <- survfit(Surv(time, status) ~ adhere, data = df)
f2 <- survfit(Surv(time, status) ~ rx, data = df)
f3 <- survfit(Surv(time, status) ~ sex, data = df)
f4 <- survfit(Surv(time, status) ~ obstruct, data = df)
f5 <- survfit(Surv(time, status) ~ perfor, data = df)
f6 <- survfit(Surv(time, status) ~ node4, data = df)
f7 <- survfit(Surv(time, status) ~ differentiation, data = df)
f8 <- survfit(Surv(time, status) ~ local_spread, data = df)
f9 <- survfit(Surv(time, status) ~ surg, data = df)

fits <- list(adhere = f1, rx = f2, sex =f3, obstruct = f4 , perfor=f5 , node4 =f6 , differentiation =f7 , local_spread =f8 ,  surg= f9)

# Visualize

legend.title <- list("adhere", "rx", "sex", "obstruct", "perfor", "node4", "differentiation", "local_spread", "surg")
ggsurvplot_list(fits, df,  pval=TRUE) #legend.title = legend.title,
$adhere


$rx


$sex


$obstruct


$perfor


$node4


$differentiation


$local_spread


$surg


attr(,"class")
[1] "list"            "ggsurvplot_list"

Figure 3: Survival curves stratified by categorical variables

There is a statistically significant difference in survival times between two or more groups being compared, as shown in the survival curves stratified by adherence to nearby organs, rx, obstruction, presence of four or more tumor-positive nodes, differentiation, local spread, and surgery. Survival time is not significantly different between males and females or among those with or without tumor perforation. The vertical lines on the Kaplan-Meier curve, which show censoring data points, are enriched after about 1,800 days, suggesting that many participants were lost to follow-up after this period.

Patient characteristics that showed significant differences in survival time between two categories are important predictors of survival time. However, the Kaplan-Meier curve does not adjust for other variables; therefore, some of the predictors may not be significant when included in the multivariate Cox regression model.

IIId. Cox regression models

Model_1: Base Model

Model_1 <- coxph(Surv(time, status) ~ 1, data = df)
c_index_model_1 <- concordance(Model_1)
cat("Concordance of the base model:",c_index_model_1$concordance)
Concordance of the base model: 0.5

Model_2: Univariate analysis

# Univariate analysis
Model_2 <- coxph(Surv(time, status) ~ rx, data = df)


# Create the regression table and add concordance statistic
summary_table <- tbl_regression(Model_2, exponentiate = TRUE) %>%
  add_glance_source_note(
    label = list(concordance = "Concordance"),
    include = c("concordance")
  ) %>%
  modify_table_styling(
    columns = p.value,
    rows = p.value < 0.05,
    text_format = "bold"
  )

# Convert to gt table, increase font size, and adjust width
gt_table <- as_gt(summary_table) %>%
  gt::tab_options(
    table.font.size = px(17),  # Increase font size
    table.align = "center",      # Align table to the left
    table.width = pct(50)     # Make the table wider
  ) %>%
  gt::tab_header(
    title = gt::md("**Table 3. Model_2: Univariate model**")
  )

# Print the gt table
gt_table
Table 3. Model_2: Univariate model
Characteristic HR1 95% CI1 p-value
rx


    Obs
    Lev 0.97 0.78, 1.21 0.8
    Lev+5FU 0.69 0.55, 0.87 0.002
Concordance = 0.536
1 HR = Hazard Ratio, CI = Confidence Interval

The coefficient of Lev is not significant, suggesting that there is no evidence that this treatment affects survival time compared to observation. however Lev+5Fu is significant (p=0.00175), indicating that the treatment Lev +5Fu has a statistically significant effect on survival time compared to the reference group. The negative sign indicates that this treatment group has a lower hazard and likely a longer survival time. The hazard ratio for Lex+5FU (0.690), indicating the risk of death is about 31% lower compared to the observation group.

Model_2: Cox proportional hazard assumption test

zph_test <- cox.zph(Model_2)

# Convert the Schoenfeld residuals test results to a data frame
zph_df <- as.data.frame(zph_test$table)
zph_df$Variable <- rownames(zph_df)

zph_df <- zph_df %>%
  mutate(
    chisq = round(chisq, 3),
    p = round(p, 3)
  )

zph_df %>%
  kbl(caption = "Table 4: Schoenfeld Residuals Test Results") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = FALSE, position = "center") %>%
  row_spec(0, bold = TRUE, align = "center")
Table 4: Schoenfeld Residuals Test Results
chisq df p Variable
rx 1.481 2 0.477 rx
GLOBAL 1.481 2 0.477 GLOBAL
# plot the Schoenfeld residuals
plot(zph_test)

Figure 4: Schoenfeld residal plot of rx variable included in Model_2

The Schoenfeld residal plot shows that the residuals are scattered randomly and the smooth trend line is horizontal near 0. This suggests that the hazard ratio for rx (treatment status) is constant over time and the proportional hazard assumption is met. The global p-value is >0.05, indicating that the the assumption is met.

Model_3: All predictors

# Full variables: All predictors
Model_3<- coxph(Surv(time, status) ~ rx+ age + sex + perfor + adhere + surg + obstruct + differentiation + node4+ local_spread, data = df)

summary_table <- tbl_regression(Model_3, exponentiate = TRUE) %>%
  add_glance_source_note(
    label = list(concordance = "Concordance"),
    include = c("concordance")
  ) %>%
  modify_table_styling(
    columns = p.value,
    rows = p.value < 0.05,
    text_format = "bold"
  )

gt_table <- as_gt(summary_table) %>%
  tab_options(
    table.font.size = px(17),  # Reduce font size
    table.align = "center",    # Align table to the center
    table.width = pct(50),     # Make the table wider
    data_row.padding = px(2)   # Reduce row padding
  ) %>%
  gt::tab_header(
    title = gt::md("**Table 5. Model_3: All predictors**")
  )

gt_table
Table 5. Model_3: All predictors
Characteristic HR1 95% CI1 p-value
rx


    Obs
    Lev 0.98 0.79, 1.22 0.9
    Lev+5FU 0.69 0.54, 0.87 0.002
age 1.01 1.00, 1.02 0.083
sex 1.04 0.86, 1.26 0.7
perfor 1.00 0.59, 1.70 >0.9
adhere 1.18 0.92, 1.53 0.2
surg 1.27 1.03, 1.55 0.022
obstruct 1.33 1.06, 1.68 0.015
differentiation


    moderate
    poor 1.43 1.13, 1.82 0.003
    well 1.08 0.78, 1.50 0.6
node4 2.55 2.10, 3.09 <0.001
local_spread


    contiguous
    muscle 0.39 0.23, 0.64 <0.001
    serosa 0.64 0.43, 0.94 0.023
    submucosa 0.29 0.10, 0.83 0.021
Concordance = 0.674
1 HR = Hazard Ratio, CI = Confidence Interval

The summary of Model_3 shows that when all variables are included in the model, age, sex, perforation, and adherence are not significant predictors. In contrast, rx, surg, obstruct, differentiation, node4, and local spread are significant predictors. The concordance of the multivariable model (0.674) is higher than that of the univariate model (concordance = 0.53), suggesting that the multivariate model is a better fit. However, the model concordance did not improve when removing predictors that were not significant in Model_3.

Stepwise variable selection:

A step-wise variable selection approach was used to confirm that the significant predictors in Model_3 are the best set of predictors. The stepAIC() function from the MASS package was used for stepwise variable selection, utilizing the Akaike Information Criterion (AIC) to add or remove predictors from the model.

# Model_2 includes all variables
Model_2 <- coxph(Surv(time, status) ~ rx+ age + sex + perfor + adhere + surg + obstruct + differentiation + node4+
              local_spread, data = df)  

# stepwise selection
stepwise_model <- stepAIC(Model_2, direction = "both")
Start:  AIC=5741.4
Surv(time, status) ~ rx + age + sex + perfor + adhere + surg + 
    obstruct + differentiation + node4 + local_spread

                  Df    AIC
- perfor           1 5739.4
- sex              1 5739.6
- adhere           1 5741.0
<none>               5741.4
- age              1 5742.5
- surg             1 5744.5
- obstruct         1 5745.1
- differentiation  2 5745.4
- rx               2 5749.9
- local_spread     3 5753.1
- node4            1 5822.0

Step:  AIC=5739.4
Surv(time, status) ~ rx + age + sex + adhere + surg + obstruct + 
    differentiation + node4 + local_spread

                  Df    AIC
- sex              1 5737.6
- adhere           1 5739.1
<none>               5739.4
- age              1 5740.5
+ perfor           1 5741.4
- surg             1 5742.5
- obstruct         1 5743.2
- differentiation  2 5743.5
- rx               2 5747.9
- local_spread     3 5751.1
- node4            1 5820.1

Step:  AIC=5737.59
Surv(time, status) ~ rx + age + adhere + surg + obstruct + differentiation + 
    node4 + local_spread

                  Df    AIC
- adhere           1 5737.3
<none>               5737.6
- age              1 5738.6
+ sex              1 5739.4
+ perfor           1 5739.6
- surg             1 5740.7
- obstruct         1 5741.2
- differentiation  2 5741.7
- rx               2 5746.2
- local_spread     3 5749.3
- node4            1 5818.2

Step:  AIC=5737.26
Surv(time, status) ~ rx + age + surg + obstruct + differentiation + 
    node4 + local_spread

                  Df    AIC
<none>               5737.3
+ adhere           1 5737.6
- age              1 5738.6
+ sex              1 5739.1
+ perfor           1 5739.2
- surg             1 5740.7
- obstruct         1 5740.9
- differentiation  2 5742.1
- rx               2 5746.0
- local_spread     3 5750.4
- node4            1 5817.4
# Extract the coefficients of the selected model
selected_variables <- coef(stepwise_model)

# Print the selected variables
print(selected_variables)
                rxLev             rxLev+5FU                   age 
         -0.010789186          -0.375746378           0.007365529 
                 surg              obstruct   differentiationpoor 
          0.243870576           0.283164609           0.373782987 
  differentiationwell                 node4    local_spreadmuscle 
          0.069021620           0.929854405          -0.995556083 
   local_spreadserosa local_spreadsubmucosa 
         -0.501168855          -1.322018421 

Based on a step-wise selection, rx, age, surg, obstruct, differentiation, node4 and local spread were selected for inclusion in the final model. Age is a significant predictor in the step-wise selected variables, but not in the full variable model (Model_2).

Model_4: Stepwise-Selected variables

Model_4 <- coxph(Surv(time, status) ~ rx + age + surg + obstruct + 
    differentiation + node4 + local_spread, data = df)

summary_table <- tbl_regression(Model_4, exponentiate = TRUE) %>%
  add_glance_source_note(
    label = list(concordance = "Concordance"),
    include = c("concordance")
  ) %>%
  modify_table_styling(
    columns = p.value,
    rows = p.value < 0.05,
    text_format = "bold"
  )

gt_table <- as_gt(summary_table) %>%
  tab_options(
    table.font.size = px(17),  # Reduce font size
    table.align = "center",    # Align table to the center
    table.width = pct(50),     # Make the table wider
    data_row.padding = px(2)   # Reduce row padding
  ) %>%
  gt::tab_header(
    title = gt::md("**Table 6. Model_4: Stepwise-selected variables**")
  )

gt_table
Table 6. Model_4: Stepwise-selected variables
Characteristic HR1 95% CI1 p-value
rx


    Obs
    Lev 0.99 0.80, 1.23 >0.9
    Lev+5FU 0.69 0.54, 0.87 0.002
age 1.01 1.00, 1.02 0.069
surg 1.28 1.04, 1.56 0.018
obstruct 1.33 1.06, 1.67 0.015
differentiation


    moderate
    poor 1.45 1.15, 1.84 0.002
    well 1.07 0.77, 1.48 0.7
node4 2.53 2.09, 3.07 <0.001
local_spread


    contiguous
    muscle 0.37 0.23, 0.61 <0.001
    serosa 0.61 0.41, 0.89 0.010
    submucosa 0.27 0.09, 0.76 0.014
Concordance = 0.672
1 HR = Hazard Ratio, CI = Confidence Interval

The concordance of Model_4 did not improve compared to Model_3, which included all predictors. However, Model_4 is less complex as it includes a smaller number of variables (see model comparison table below).

Model_4: Cox proportional hazard assumption test

 # final model with step wise variable selection
zph_test <- cox.zph(Model_4)

# Convert the Schoenfeld residuals test results to a data frame

zph_df <- as.data.frame(zph_test$table)
zph_df$Variable <- rownames(zph_df)

zph_df <- zph_df %>%
  mutate(
    chisq = round(chisq, 3),
    p = round(p, 3)
  )

zph_df %>%
  kbl(caption = "Table 7. Schoenfeld Residuals Test for Model_4") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = FALSE, position = "center") %>%
  row_spec(0, bold = TRUE, align = "center")
Table 7. Schoenfeld Residuals Test for Model_4
chisq df p Variable
rx 2.335 2 0.311 rx
age 0.549 1 0.459 age
surg 0.020 1 0.888 surg
obstruct 6.148 1 0.013 obstruct
differentiation 17.459 2 0.000 differentiation
node4 5.662 1 0.017 node4
local_spread 7.083 3 0.069 local_spread
GLOBAL 37.525 11 0.000 GLOBAL

Model_4 does not meet the Cox proportional hazards assumption, as indicated by p-values less than 0.05 in the Schoenfeld residuals test. The differentiation, node4, and obstruct variables did not meet the proportional hazards assumption, suggesting that the effects of these variables are not constant over time. To address this, a stratification approach will be used, where the model will be stratified by the variables that did not meet the proportional hazards assumption.

Model_5: Stratified Model

Model_5 <- coxph(Surv(time, status) ~ rx + age + surg + strata(obstruct) + strata(differentiation) + node4 +
              local_spread, data = df)

summary_table <- tbl_regression(Model_5, exponentiate = TRUE) %>%
  add_glance_source_note(
    label = list(concordance = "Concordance"),
    include = c("concordance")
  ) %>%
  modify_table_styling(
    columns = p.value,
    rows = p.value < 0.05,
    text_format = "bold"
  )

gt_table <- as_gt(summary_table) %>%
  tab_options(
    table.font.size = px(17),  # Reduce font size
    table.align = "center",    # Align table to the center
    table.width = pct(50),     # Make the table wider
    data_row.padding = px(2)   # Reduce row padding
  )%>%
  gt::tab_header(
    title = gt::md("**Table 8. Model_5: Stratified Model**")
  )
gt_table
Table 8. Model_5: Stratified Model
Characteristic HR1 95% CI1 p-value
rx


    Obs
    Lev 0.98 0.79, 1.22 0.9
    Lev+5FU 0.71 0.56, 0.89 0.003
age 1.01 1.00, 1.02 0.034
surg 1.30 1.06, 1.59 0.012
node4 2.50 2.06, 3.04 <0.001
local_spread


    contiguous
    muscle 0.34 0.21, 0.56 <0.001
    serosa 0.58 0.39, 0.84 0.004
    submucosa 0.24 0.08, 0.67 0.007
Concordance = 0.674
1 HR = Hazard Ratio, CI = Confidence Interval

Model_5 Cox proportional hazard assumption test

zph_test <- cox.zph(Model_5)
# Convert the Schoenfeld residuals test results to a data frame
zph_df <- as.data.frame(zph_test$table)
zph_df$Variable <- rownames(zph_df)
zph_df <- zph_df %>%
  mutate(
    chisq = round(chisq, 3),
    p = round(p, 3)
  )

zph_df %>%
  kbl(caption = "Table 9. Schoenfeld Residuals Test for Model_5") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = FALSE, position = "center") %>%
  row_spec(0, bold = TRUE, align = "center")
Table 9. Schoenfeld Residuals Test for Model_5
chisq df p Variable
rx 2.001 2 0.368 rx
age 0.670 1 0.413 age
surg 0.014 1 0.905 surg
node4 4.288 1 0.038 node4
local_spread 5.298 3 0.151 local_spread
GLOBAL 12.411 8 0.134 GLOBAL
# plot the Schoenfeld residuals
plot(zph_test)

Figure 5: Schoenfeld residal plot of variables included in Model_5

After model stratification by obstruct and differentiation, the proportional hazards assumption is met, as the global p-value is greater than 0.05. Node4 slightly violates the assumption, but the final model is not stratified by node4 because the model concordance is attenuated when stratifying by node4.

Evaluate multicollinearity of variables in Model_5

vif <- vif(Model_5)
print(vif)
                            GVIF Df GVIF^(1/(2*Df))
rx                      1.008066  2        1.002010
age                     1.019142  1        1.009525
surg                    1.008430  1        1.004206
strata(obstruct)        3.071784  0             Inf
strata(differentiation) 3.071784  0             Inf
node4                   1.023390  1        1.011628
local_spread            1.017073  3        1.002825

None of the variables has VIF >5, therefore multicollinearity is not a problem in Model_5

Model comparison

# Fit the models and store them in a list
models <- list(Model_1, Model_2, Model_3, Model_4, Model_5)

# Add descriptions for each model
descriptions <- c(
  "Base model",
  "Treatment",
  "Full variables",
  "Stepwise-selected variables",
  "Stratified"
)


# Create a data frame to store results
results <- data.frame(
  Model = character(),
  Description = character(),
  AIC = numeric(),
  BIC = numeric(),
  C_Index = numeric(),
  stringsAsFactors = FALSE
)

# Function to calculate and store metrics for each model
for (i in seq_along(models)) {
  model <- models[[i]]
  
  # Extract AIC and BIC
  aic <- AIC(model)
  bic <- BIC(model)
  
  # Add C-index
  c_index <- concordance(model)$concordance
  
  # Append results to the data frame
  results <- rbind(results, data.frame(
    Model = paste("Model", i),
    Description = descriptions[i],
    AIC = aic,
    BIC = bic,
    C_Index = round(c_index, 3)
  ))
}

# Print the table using kable and kableExtra
results %>%
  kbl(caption = "Table 10. Model Evaluation Metrics") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = FALSE, position = "center") %>%
  column_spec(1, bold = TRUE) %>%
  column_spec(2:5, width = "10em") %>%
  kable_styling(position = "center")
Table 10. Model Evaluation Metrics
Model Description AIC BIC C_Index
Model 1 Base model 5860.383 5860.383 0.500
Model 2 Treatment 5741.401 5798.993 0.674
Model 3 Full variables 5741.401 5798.993 0.674
Model 4 Stepwise-selected variables 5737.261 5782.511 0.672
Model 5 Stratified 4567.829 4600.739 0.674

Based on the model evaluation metrics, Model_5 has the smallest AIC and BIC values and achieved the highest concordance (c-index). Although Model_2 and Model_5 have the same concordance, Model_5 is the best model as indicated by the smaller AIC and BIC values compared to Model_2. Five-fold cross-validation was performed to test the robustness of Model_5.

K-fold cross validation

set.seed(1234)

# Cox model 
cox_model <- coxph(Surv(time, status) ~ rx + age + surg + strata(obstruct) + strata(differentiation) + node4 +
              local_spread, data = df)


# calculate the original c-index
c_index_original <- concordance(cox_model)
cat("original c-index:", c_index_original$concordance, "\n")
original c-index: 0.674316 
# create a function for calculating c-index in each fold using concordance()
cox_cindex <- function(train_data, test_data) {
  fit <- coxph(Surv(time, status) ~ rx + age + surg + strata(obstruct) + strata(differentiation) + node4 + local_spread, data = train_data)
  # Calculate concordance on test data
  c_index <- concordance(fit, newdata = test_data)$concordance
  
  return(c_index)
}

# perform 5-fold cross-validation with stratification
K <- 5
folds <- createFolds(c(df$status, df$differentiation, df$rx), k = K, list = TRUE, returnTrain = TRUE)
cv_c_indices <- sapply(folds, function(train_indices) {
  train_data <- df[train_indices, ]
  test_data <- df[-train_indices, ]
  cox_cindex(train_data, test_data) # use the concordance function inside cox_cindex
})

# Print cross-validated c-indices
print(cv_c_indices)
    Fold1     Fold2     Fold3     Fold4     Fold5 
0.7156051 0.6605045 0.6562016 0.6477106 0.6943820 
# cross-validation c-indices
cat("cross-validated c-Indices for each fold:", cv_c_indices, "\n")
cross-validated c-Indices for each fold: 0.7156051 0.6605045 0.6562016 0.6477106 0.694382 
cat("mean cross-validated c-Index:", mean(cv_c_indices), "\n")
mean cross-validated c-Index: 0.6748808 
# plot cross-validation c-indices
plot(cv_c_indices, type = "b", xlab = "Fold", ylab = "c-index", main = "c-index across folds")

Figure 6: Concordance across the 5 folds

Model_5’s c-index (0.674) and mean cross-validation c-index (0.675) are very similar, suggesting that the final stratified model is stable and not overfitting.

IV. Conclusions

  • Treatment with Levamisole + 5-FU decreases the hazard of death from colon cancer by 29.5% (p=0.003).

  • Having more than 4 tumor positive lymph nodes significantly increases the hazard of death by 150% (p<0.0001).

  • Compared to patients with tumors that have spread to contiguous organs, those with tumors localized in the submucosa, muscle, and serosa have significantly reduced hazard rates by 76% (p = 0.007), 66% (p < 0.001), and 43% (p = 0.004), respectively.

  • Having long wait from surgery to trial registration increases the hazard rate by about 29.5% (p=0.012), suggesting that starting chemotherapy treatment within a short period of time after surgery improves survival.

  • For each additional year of age, the hazard rate increases by about 0.87% (p = 0.034), suggesting that age has minimal effect on survival.

  • The concordance of the model (0.674) indicates moderate predictive accuracy for survival time.

  • Other variables that were not included in the study may contribute to survival time.

References

[1]
S. J. Walters, “Analyzing time to event outcomes with a cox regression model,” Wiley Interdiscip. Rev. Comput. Stat., vol. 4, no. 3, pp. 310–315, May 2012.
[2]
V. Patil and S. Dessai, “Testing and interpreting assumptions of COX regression analysis,” Cancer Res. Stat. Treat., vol. 2, no. 1, p. 108, 2019.
[3]
V. Backmann et al., “Comprehensive strain assessment and mortality after acute myocardial infarction: A retrospective observational study based on the essen coronary artery disease registry,” Heart, Sep. 2024.
[4]
Z. Zhang, J. Reinikainen, K. A. Adeleke, M. E. Pieterse, and C. G. M. Groothuis-Oudshoorn, “Time-varying covariates and coefficients in cox regression models,” Ann. Transl. Med., vol. 6, no. 7, pp. 121–121, Apr. 2018.
[5]
T. Therneau and P. Grambsch, Modeling survival data: Extending the cox model, 1st ed. in Statistics for biology and health. New York, NY: Springer, 2010.
[6]
D. G. Kleinbaum, “The cox proportional hazards model and its characteristics,” in Survival analysis, New York, NY: Springer New York, 1996, pp. 83–128.
[7]
T. Intrieri, G. Manneschi, and A. Caldarella, “10-year survival in female breast cancer patients according to ER, PR and HER2 expression: A cancer registry population-based analysis,” J. Cancer Res. Clin. Oncol., vol. 149, no. 8, pp. 4489–4496, Jul. 2023.
[8]
Y. Wu et al., “Effect of adjuvant chemotherapy on the survival outcomes of elderly breast cancer: A retrospective cohort study based on SEER database,” J. Evid. Based Med., vol. 15, no. 4, pp. 354–364, Dec. 2022.
[9]
A. Ptak-Chmielewska and J. P. E. Gonzalez, “Default prediction using the cox regression model and macroeconomic conditions – a lifetime perspective,” Econometrics, vol. 28, no. 2, pp. 50–61, 2024.
[10]
K. K.-K. Wong, “Using cox regression to model customer time to churn in the wireless telecommunications industry,” J. Target. Meas. Anal. Mark., vol. 19, no. 1, pp. 37–43, Mar. 2011.
[11]
M. I. Rodrı́guez-Borbón, M. A. Rodrı́guez-Medina, L. A. Rodrı́guez-Picón, A. Alvarado-Iniesta, and N. Sha, “Reliability estimation for accelerated life tests based on a cox proportional hazard model with error effect,” Qual. Reliab. Eng. Int., vol. 33, no. 7, pp. 1407–1416, Nov. 2017.
[12]
L. Cheng, W. K. Chan, L. Zhu, M. H. Chao, and Y. Wang, “Confronting inequalities and bridging the divide: A retrospective study assessment of country-level COVID-19 vaccine equality with a cox regression model,” Vaccines (Basel), vol. 12, no. 5, p. 552, May 2024.
[13]
T. M. Therneau, A package for survival analysis in r. 2024. Available: https://CRAN.R-project.org/package=survival
[14]
Terry M. Therneau and Patricia M. Grambsch, Modeling survival data: Extending the Cox model. New York: Springer, 2000.
[15]
C. G. Moertel et al., “Levamisole and fluorouracil for adjuvant therapy of resected colon carcinoma,” N. Engl. J. Med., vol. 322, no. 6, pp. 352–358, Feb. 1990.
[16]
A. Kassambara, M. Kosinski, and P. Biecek, Survminer: Drawing survival curves using ’ggplot2’. 2021. Available: https://CRAN.R-project.org/package=survminer
[17]
W. N. Venables and B. D. Ripley, Modern applied statistics with s, Fourth. New York: Springer, 2002. Available: https://www.stats.ox.ac.uk/pub/MASS4/
[18]
R Core Team, R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2023. Available: https://www.R-project.org/
[19]
A. C. Davison and D. V. Hinkley, Bootstrap methods and their applications. Cambridge: Cambridge University Press, 1997. Available: doi:10.1017/CBO9780511802843
[20]
Angelo Canty and B. D. Ripley, Boot: Bootstrap r (s-plus) functions. 2024.